

AD-A19S  321 




REPORT DOCUMENTATION PAGE


READ INSTRUCTIONS
BEFORE COMPLETING FORM


2. GOVT ACCESSION NO.   3. RECIPIENT'S CATALOG NUMBER


4. TITLE (and Subtitle)


Weighted  and  clouded  distributions 


7. AUTHOR(s)


C.  Radhakrishna  Rao 


5. TYPE OF REPORT & PERIOD COVERED

Technical - February 1988


6. PERFORMING ORG. REPORT NUMBER
88-01


8. CONTRACT OR GRANT NUMBER(s)

Grant AFSO-88-0030


9. PERFORMING ORGANIZATION NAME AND ADDRESS

Center  for  Multivariate  Analysis 

Fifth  floor  -  Thackeray  Hall 

University of Pittsburgh, Pittsburgh, PA 15260


11. CONTROLLING OFFICE NAME AND ADDRESS   12. REPORT DATE

Air Force Office of Scientific Research   February 1988

Department of the Air Force   13. NUMBER OF PAGES

Bolling Air Force Base, DC 20332   28


14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)   15. SECURITY CLASS. (of this report)




Unclassified 


15a. DECLASSIFICATION/DOWNGRADING
SCHEDULE


16. DISTRIBUTION STATEMENT (of this Report)

Approved for public release; distribution unlimited.


17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)



AUG  2  5  1988 


19. KEY WORDS (Continue on reverse side if necessary and identify by block number)


damage  model,  probability  sampling,  quadrat  sampling,  size  biased  sampling, 
truncation,  weighted  distribution 


20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

The  concept  of  weighted  distributions  can  be  traced  to  the  study  of 

effects  of  methods  of  ascertainment  upon  the  estimation  of  frequencies  by 

Fisher  in  1934.  It  was  formulated  in  general  terms  by  the  author  in  a 

Continued-- 




Unclassified 

SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)




20. Abstract (continued)


paper  presented  at  the  First  International  Symposium  on  Classical  and 
Contagious  Distributions  held  in  Montreal  in  1963.  Since  then  a  number  of 
papers  have  appeared  on  the  subject.  This  article  reviews  the  previous 
work  and  the  current  developments  with  some  examples. 

Weighted  distributions  occur  in  a  natural  way  when  adjustments  have 
to  be  made  in  the  original  probability  distribution  due  to  deviations  from 
simple  random  sampling  in  collecting  data,  as  when  the  events  that  occur 
do  not  have  the  same  chance  of  coming  into  the  sample.  The  examples  include: 
p.p.s.  (probability  proportional  to  size)  sampling  in  sample  surveys,  damage 
models,  visibility  bias  in  quadrat  sampling  in  ecological  studies,  sampling 
through affected individuals in genetic studies, waiting time paradox and so on.




WEIGHTED  AND  CLOUDED  DISTRIBUTIONS* 


C.  Radhakrishna  Rao 

University  of  Pittsburgh 
Pittsburgh,  PA  15260 

Technical  Report  No.  88-01 


February  1988 


Center for Multivariate Analysis
5th Floor Thackeray Hall
University of Pittsburgh
Pittsburgh, PA 15260


* Research sponsored by the Air Force Office of Scientific Research under
Grant AFSO-88-0030. The United States Government is authorized to reproduce
and distribute reprints for governmental purposes notwithstanding any
copyright notation hereon.



ABSTRACT 

The  concept  of  weighted  distributions  can  be  traced  to  the  study  of 
effects  of  methods  of  ascertainment  upon  the  estimation  of  frequencies  by 
Fisher  in  1934.  It  was  formulated  in  general  terms  by  the  author  in  a 
paper  presented  at  the  First  International  Symposium  on  Classical  and 
Contagious  Distributions  held  in  Montreal  in  1963.  Since  then  a  number  of 
papers  have  appeared  on  the  subject.  This  article  reviews  the  previous 
work  and  the  current  developments  with  some  examples. 

Weighted  distributions  occur  in  a  natural  way  when  adjustments  have 
to  be  made  in  the  original  probability  distribution  due  to  deviations  from 
simple  random  sampling  in  collecting  data,  as  when  the  events  that  occur 
do  not  have  the  same  chance  of  coming  into  the  sample.  The  examples  include: 
p.p.s.  (probability  proportional  to  size)  sampling  in  sample  surveys,  damage 
models, visibility bias in quadrat sampling in ecological studies, sampling
through affected individuals in genetic studies, waiting time paradox and so on.

AMS 1980 Subject Classifications: Primary 60E05, secondary 62D05.

Key  words  and  phrases:  damage  model,  probability  sampling,  quadrat  sampling, 
size  biased  sampling,  truncation,  weighted  distribution. 



DEDICATION 


Professor  K.  Nagabhushanam  was  one  of  the  two  inspiring  teachers  I 
had  when  I  was  pursuing  my  graduate  studies  in  mathematics  at  the  Andhra 
University,  Waltair.  He  and  Professor  V.  Ramaswamy  not  only  taught  us 
mathematics  but  also  prepared  us  to  think  in  terms  of  mathematics  and  to 
use  mathematics  as  an  abstract  logical  method  in  solving  complex  problems 
in  any  field  of  inquiry.  This  training  was  a  great  asset  to  me  when  I 
started  on  my  research  career.  I  had  kept  in  touch  with  Professor 
Nagabhushanam  after  I  left  Andhra  University  as  he  was  keenly  interested 
in  my  activities  and  often  encouraged  me  in  my  research  work.  It  is, 
indeed,  a  great  honor  to  contribute  to  the  memorial  volume  of  my  respected 
teacher.  The  contents  of  this  chapter  are  specifically  addressed  to  the 
students  and  teachers  of  statistics  who  are  looking  for  simple  examples 
to  demonstrate  some  natural  pitfalls  in  statistical  data  analysis  and 
inference. 


1.  INTRODUCTION 


In statistical inference, i.e., making statements about a population
on the basis of a sample drawn from it, it is necessary to identify the
holy trinity, viz., the sample space $\Omega$, the Borel field of sets $F$ defined on $\Omega$,
and a family of probability measures $P$ defined on $F$. Statistical analysis
is concerned with setting up a correspondence between a sample (a member
of $\Omega$) and an element (or subset of elements) of $P$. An important part of
the trinity is the specification of $P$. Wrong specification may lead to wrong
inference, which is sometimes called the third kind of error in statistical
parlance.

The problem of specification is not a simple one. A detailed knowledge
of the procedure actually employed in acquiring data is an essential
ingredient in arriving at a proper specification. The situation is more
complicated with field observations and nonexperimental data, where nature
produces  events  according  to  a  certain  stochastic  model,  and  the  events  are 
observed  and  recorded  by  field  investigators.  There  does  not  always  exist 
a  suitable  sampling  frame  for  designing  a  sample  survey  to  ensure  that  the 
events  which  occur  have  specified  (usually  equal)  chances  of  coming  into  the 
sample.  In  practice,  all  the  events  that  occur  in  nature  cannot  be  brought 
into  the  sample  frame.  For  instance,  certain  events  may  not  be  observable 
and  therefore  missed  in  the  record.  This  gives  rise  to  what  are  called 
truncated,  censored  or  incomplete  samples.  Or  an  event  that  has  occurred 
may  be  observable  only  with  a  certain  probability  depending  on  the  nature 
of  the  event,  such  as  its  conspicuousness  and  the  procedure  employed 
to  observe  it,  resulting  in  unequal  probability  sampling.  Or  an  event  which 
has  occurred  may  change  in  a  random  way  by  the  time  or  during  the  process 
of observation so that what comes on record is a modified event, in which
case the change or damage has to be appropriately modeled for statistical
analysis.  Sometimes,  events  from  two  or  more  different  sources  having 
different  stochastic  mechanisms  may  get  mixed  up  and  brought  into  the  same 
record,  resulting  in  contaminated  samples.  In  all  of  these  cases,  the 
specification  for  the  original  events  (as  they  occur)  may  not  be  appropriate 
for  the  events  as  they  are  recorded  (observed  data)  unless  it  is  suitably 
modified.  Examples  of  such  situations  are  given  in  Rao  (1965,  1975,  1985). 

In  a  classical  paper,  Fisher  (1934)  demonstrated  the  need  for  such  an 
adjustment  in  specification  depending  on  the  way  data  are  ascertained.  The 
author  extended  the  basic  ideas  of  Fisher  in  Rao  (1965)  and  developed  the 
theory  of  what  are  called  weighted  distributions  as  a  method  of  adjustment 
applicable  to  many  situations.  In  this  paper  we  discuss  the  general  theory 
and  some  recent  developments  through  some  examples. 

2.  TRUNCATION 

Some  events,  although  they  occur,  may  be  unascertainable,  so  that  the 
observed  distribution  is  truncated  to  a  certain  region  of  the  sample  space. 
For  instance,  if  we  are  investigating  the  distribution  of  the  number  of  eggs 
laid  by  an  insect,  the  frequency  of  zero  eggs  is  not  ascertainable.  Another 
example  is  the  frequency  of  families  where  both  parents  are  heterozygous  for 
albinism  but  have  no  albino  children.  There  is  no  evidence  that  the  parents 
are  heterozygous  unless  they  have  an  albino  child,  and  the  families  with 
such parents and having no albino children get confounded with normal
families having no children. The actual frequency of the event zero albino
children is thus not ascertainable.

In general, if $p(x,\theta)$ is the p.d.f. (probability density function for a
continuous variable or probability for a discrete variable), where $\theta$ denotes
an unknown parameter, and the random variable $X$ is truncated to a specified
region $T \subset \Omega$ of the sample space, then the p.d.f. of the truncated random
variable $X^T$ is

$$p^T(x,\theta) = \frac{w(x,T)\,p(x,\theta)}{u(T,\theta)} \tag{2.1}$$

where $w(x,T) = 1$ if $x \in T$ and $= 0$ if $x \notin T$, and $u(T,\theta) = E[w(X,T)]$. The
expression (2.1) is the original probability density weighted by a suitable
function, and it provides a simple example of a weighted probability
distribution whose general definition is given in the next section.

Suppose the event zero is not observable in sampling from a binomial
distribution with index $n$ and probability of success $\pi$. Let $R^T$ denote the
TB (truncated binomial) random variable. Then

$$P(R^T = r) = \frac{\binom{n}{r}\pi^r(1-\pi)^{n-r}}{1-(1-\pi)^n}, \quad r = 1,\ldots,n. \tag{2.2}$$

For such a distribution

$$E(R^T) = \frac{n\pi}{1-(1-\pi)^n} \quad \text{and} \quad E\left(\frac{R^T}{n}\right) = \frac{\pi}{1-(1-\pi)^n}, \tag{2.3}$$

which are somewhat larger than those for a complete binomial, for which the
above values are $n\pi$ and $\pi$ respectively.

The following data relate to the numbers of brothers and sisters in
families of the girls whose names were found in a private telephone notebook of
a European professor. (The first number within the brackets gives the number
of sisters including the respondent and the second number, that of her brothers.)

(1,0), (1,0), (1,1), (1,1), (1,1), (1,1), (1,1), (1,1), (1,1), (1,1),
(1,1), (2,0), (2,0), (2,0), (2,1), (2,1), (2,1), (2,1), (1,2), (1,2),     (2.4)
(3,0), (3,1), (3,1), (1,3), (1,3), (4,0), (4,1), (1,4)


Since at least one girl is present in the family, we may try and see
whether the data conform to a TB distribution with the observation on zero
sisters missing. The expected number of girls under this hypothesis,
assuming $\pi = 0.5$, is

$$\sum_{n \ge 1} f(n)\,E(r \mid n) \tag{2.5}$$

where f(n) is the observed number of families with size n (i.e., the total
number of brothers and sisters). Using the formulas (2.3) and (2.5) and the
data (2.4), we have:


The  observed  figures  seem  to  be  in  good  agreement  with  those  expected  under 
the  hypothesis  of  truncated  binomial.  However,  a  different  story  may  emerge 
in  a  similar  situation  as  in  the  following  data  giving  the  numbers  of  sisters 
and  brothers  in  the  families  of  girl  acquaintances  of  a  male  student  in  Calcutta. 

(2,1),  (1,1),  (3,0),  (2,0),  (3,1),  (1,0),  (2,1),  (1,0),  (1,1),  (1,1).  (2.6) 

The expected number of sisters under the hypothesis of truncated binomial
is 9.5 (using the formulas (2.3) and (2.5)), whereas the observed number is
17. The truncated binomial is not appropriate for the data (2.6), and it
appears that the mechanisms of encountering girls are different in
the cases of the professor and the student.
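The calculation described by (2.3) and (2.5) can be sketched in a few lines of Python. This is only an illustration for the data (2.4) under the stated assumption $\pi = 0.5$; the function names `tb_mean` and `expected_and_observed` are hypothetical and do not come from the paper.

```python
from collections import Counter

# Data (2.4): (sisters including the respondent, brothers) for each family.
DATA_2_4 = (
    [(1, 0)] * 2 + [(1, 1)] * 9 + [(2, 0)] * 3 + [(2, 1)] * 4 + [(1, 2)] * 2
    + [(3, 0)] + [(3, 1)] * 2 + [(1, 3)] * 2 + [(4, 0), (4, 1), (1, 4)]
)


def tb_mean(n, pi=0.5):
    """E(R^T | n) from (2.3): mean of a binomial(n, pi) truncated at zero."""
    return n * pi / (1.0 - (1.0 - pi) ** n)


def expected_and_observed(data, pi=0.5):
    """Return (expected, observed) total number of sisters, using (2.5)."""
    f = Counter(s + b for s, b in data)  # f(n): number of families of size n
    expected = sum(count * tb_mean(n, pi) for n, count in f.items())
    observed = sum(s for s, _ in data)
    return expected, observed


expected, observed = expected_and_observed(DATA_2_4)
print(f"expected = {expected:.1f}, observed = {observed}")
```

The same function can be applied to any other list of (sisters, brothers) pairs to compare a truncated binomial expectation with the observed total.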

Note that if we sample a number of households in a city and ascertain
the numbers of brothers and sisters (i.e., sons and daughters) in each
household, then we expect the number of sisters to follow a complete
binomial distribution. If from such data we omit the households which do not
have girls, then the data would follow a truncated binomial distribution.
We shall see in the next section that a different distribution holds when
data are ascertained about the sibs from a sample of boys or girls one
encounters. The case of the student seems to fall in such a category.


3.  WEIGHTED  DISTRIBUTIONS 


In Section 2, we have considered a situation where certain events are
unobservable. But a more general case is when an event that occurs has a
certain probability of being recorded (or included in the sample). Let $X$
be a random variable with $p(x,\theta)$ as the p.d.f., where $\theta$ is a parameter, and
suppose that when $X = x$ occurs, the probability of recording it is $w(x,\alpha)$,
depending on the observed $x$ and possibly also on an unknown parameter $\alpha$.
Then the p.d.f. of the resulting random variable $X^w$ is

$$p^w(x,\theta,\alpha) = \frac{w(x,\alpha)\,p(x,\theta)}{E[w(X,\alpha)]}. \tag{3.1}$$

Although in deriving (3.1) we chose $w(x,\alpha)$ such that $0 \le w(x,\alpha) \le 1$, we
may formally define (3.1) for any arbitrary nonnegative function $w(x,\alpha)$ for
which $E[w(X,\alpha)]$ exists. The p.d.f. so obtained is called a weighted version
of $p(x)$ and denoted by $p^w(x)$. In particular, the weighted distribution

$$p^w(x,\theta) = \frac{f(x)\,p(x,\theta)}{E[f(X)]}, \tag{3.2}$$

where $f(x)$ is some monotonic function of $x$, is called a size biased
distribution. When $x$ is univariate and nonnegative, the weighted distribution

$$p^w(x,\theta) = \frac{x^{\alpha}\,p(x,\theta)}{E(X^{\alpha})} \tag{3.3}$$

introduced in Rao (1965) has found applications in many practical problems
(see Rao (1985)). When $\alpha = 1$, it is called a length (size) biased distribution.
For example, if $X$ has the logarithmic series distribution

$$\frac{\alpha^r}{-r\log(1-\alpha)}, \quad r = 1,2,\ldots, \tag{3.4}$$


then the distribution of the length biased variable is

$$(1-\alpha)\,\alpha^{r-1}, \quad r = 1,2,\ldots,$$

which shows that $X^w - 1$ has a geometric distribution. A truncated geometric
distribution is sometimes found to provide a good fit to an observed
distribution of family size (Feller, 1968). But, if the information on family size
has been ascertained from school children, then the observations may have a
size biased distribution. In such a case, a good fit of the geometric
distribution to the observed family size would indicate that the underlying
distribution is, in fact, a logarithmic series.
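The passage from the logarithmic series (3.4) to a shifted geometric distribution can be checked numerically. The sketch below is illustrative only: `size_biased` and `log_series` are hypothetical helper names, the value of alpha is arbitrary, and the infinite support is cut off at a point where the neglected tail is negligible.

```python
import math


def size_biased(pmf, support):
    """Size-biased pmf x * p(x) / E(X) over a finite support ((3.3) with exponent 1)."""
    mean = sum(x * pmf(x) for x in support)
    return {x: x * pmf(x) / mean for x in support}


def log_series(alpha):
    """pmf of the logarithmic series distribution (3.4)."""
    return lambda r: alpha ** r / (-r * math.log(1.0 - alpha))


alpha = 0.4              # hypothetical parameter value
support = range(1, 200)  # cut-off chosen so that the omitted tail is negligible
weighted = size_biased(log_series(alpha), support)

# Compare with the geometric probabilities (1 - alpha) * alpha**(r - 1).
for r in range(1, 6):
    geometric = (1.0 - alpha) * alpha ** (r - 1)
    print(r, round(weighted[r], 6), round(geometric, 6))
```

The two columns printed for each r should agree, in line with the statement that $X^w - 1$ is geometric.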

Table  1  gives  a  list  of  some  basic  distributions  and  their  size  biased 
forms.  It  is  seen  that  the  size  biased  form  belongs  to  the  same  family  as  the 
original  distribution  in  all  cases  except  the  logarithmic  series. 

An extensive literature on weighted distributions has appeared since
the concept was formalized in Rao (1965); it is reviewed with a large number
of references in a paper by Patil (1984) with special reference to the earlier
contributions by Patil and Rao (1977, 1978) and Patil and Ord (1976). Rao
(1985) contains an updated review of the previous work and some new results.


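Table 1. Certain Basic Distributions and Their Size-Biased Forms

Random variable (rv)                  pf (pdf)                                                                Size-biased rv

Binomial, B(n,p)                      $\binom{n}{x} p^x (1-p)^{n-x}$                                          1 + B(n-1, p)
Negative binomial, NB(k,p)            $\binom{k+x-1}{x} p^x (1-p)^k$                                          1 + NB(k+1, p)
Poisson, Po($\lambda$)                $e^{-\lambda}\lambda^x / x!$                                            1 + Po($\lambda$)
Logarithmic series, L($\alpha$)       $\{-\log(1-\alpha)\}^{-1}\alpha^x / x$                                  1 + NB(1, $\alpha$)
Hypergeometric, H(n,M,N)              $\binom{n}{x} M^{(x)} (N-M)^{(n-x)} / N^{(n)}$                          1 + H(n-1, M-1, N-1)
Binomial beta, BB(n,$\alpha$,$\gamma$)            $\binom{n}{x} \beta(\alpha+x, \gamma+n-x) / \beta(\alpha,\gamma)$       1 + BB(n-1, $\alpha$, $\gamma$)
Negative binomial beta, NBB(k,$\alpha$,$\gamma$)  $\binom{k+x-1}{x} \beta(\alpha+x, \gamma+k) / \beta(\alpha,\gamma)$     1 + NBB(k+1, $\alpha$, $\gamma$)
Gamma, G($\alpha$,k)                  [illegible]                                                             G($\alpha$, k+1)
Beta first kind, $B_1(\delta,\gamma)$     $x^{\delta-1}(1-x)^{\gamma-1} / \beta(\delta,\gamma)$               $B_1(\delta+1, \gamma)$
Beta second kind, $B_2(\delta,\gamma)$    $x^{\delta-1}(1+x)^{-\gamma} / \beta(\delta,\gamma-\delta)$         $B_2(\delta+1, \gamma-\delta-1)$
Pearson type V, Pe(k)                 $x^{-k-1}\exp(-x^{-1}) / \Gamma(k)$                                     Pe(k-1)
Pareto, Pa($\alpha$,$\gamma$)         $\gamma\alpha^{\gamma}x^{-(\gamma+1)}$, $x \ge \alpha$                  Pa($\alpha$, $\gamma-1$)
Lognormal, LN($\mu$,$\sigma^2$)       $(2\pi\sigma^2)^{-1/2} x^{-1} \exp\{-\frac{1}{2}((\log x-\mu)/\sigma)^2\}$   LN($\mu+\sigma^2$, $\sigma^2$)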


4.  p.p.s.  SAMPLING 


An example of a weighted distribution arises in sample surveys when
unequal probability sampling, or what is known as p.p.s. (probability
proportional to size) sampling, is employed. A general version of the sampling
scheme involves two random variables $X$ and $Y$ with p.d.f. $p(x,y,\theta)$ and a weight
function $w(y)$ which is a function of $y$ only, giving a weighted p.d.f.

$$p^w(x,y,\theta) = \frac{w(y)\,p(x,y,\theta)}{E[w(Y)]}. \tag{4.1}$$

In sample surveys, we obtain observations on $(X^w, Y^w)$ from the p.d.f. (4.1) and
draw inference on the parameter $\theta$.

It is of interest to note that the marginal p.d.f. of $X^w$ is

$$p^w(x,\theta) = \frac{w(x,\theta)\,p(x,\theta)}{E[w(Y)]}, \tag{4.2}$$

which is a weighted version of $p(x,\theta)$ with the weight function

$$w(x,\theta) = \int p(y \mid x)\,w(y)\,dy. \tag{4.3}$$


If we have a sample of size n

$$(x_1,y_1), \ldots, (x_n,y_n) \tag{4.4}$$

from the distribution (4.1), then an estimate of $E(X)$, the mean with respect
to the original p.d.f. $p(x,y,\theta)$, which is the parameter of interest, is

$$\frac{E[w(Y)]}{n}\sum_{i=1}^{n}\frac{x_i}{w(y_i)}, \tag{4.5}$$

which is an unbiased estimator of $E(X)$. The estimator

$$\frac{1}{n}\sum_{i=1}^{n} x_i$$

would be an unbiased estimator of $E(X^w)$, the mean with respect to the weighted
p.d.f. $p^w(x,\theta)$ given by (4.2) and (4.3).
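The unbiasedness claims for (4.5) and for the plain sample mean can be verified exactly on a small discrete example. In the sketch below the joint distribution `JOINT` and the weight function `w` are hypothetical choices made only for illustration; exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction as F

# Hypothetical joint p.d.f. p(x, y) on a small grid; the values are illustrative only.
JOINT = {(1, 1): F(1, 4), (2, 1): F(1, 4), (2, 3): F(1, 6), (5, 3): F(1, 3)}


def w(y):
    """Weight function depending on y only (here proportional to the size y)."""
    return F(y)


E_w = sum(p * w(y) for (x, y), p in JOINT.items())              # E[w(Y)]
E_X = sum(p * x for (x, y), p in JOINT.items())                 # E(X) under the original p.d.f.
weighted = {xy: w(xy[1]) * p / E_w for xy, p in JOINT.items()}  # weighted p.d.f. (4.1)

# Expectation under (4.1) of one term E[w(Y)] * x / w(y) of the estimator (4.5).
E_term = sum(pw * E_w * x / w(y) for (x, y), pw in weighted.items())
# Expectation under (4.1) of x itself, i.e. the mean of the weighted marginal.
E_Xw = sum(pw * x for (x, y), pw in weighted.items())

print(E_X, E_term, E_Xw)
```

Since each term of (4.5) has expectation E(X) under the weighted sampling scheme, so does their average; the unweighted mean instead targets $E(X^w)$, which differs from E(X) in this example.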

5.  WEIGHTED  BINOMIAL  DISTRIBUTION:  TWO  EMPIRICAL  THEOREMS 

Suppose  that  we  ascertain  from  each  male  member  in  a  class  or  in  any 
congregation  the  number  of  brothers  including  himself  and  the  number  of  sisters 
he  has  and  raise  the  following  question.  What  is  the  approximate  value  of 
B/(B+S),  where  B  and  S  are  the  total  numbers  of  brothers  and  sisters  in  all 
the  families  of  the  male  members?  It  is  clear  that  we  are  sampling  from  a 
truncated  distribution  of  families  with  at  least  one  male  member  so  that 
B/(B+S) should be larger than one half. But by how much? Surprisingly, when
k, the number of males asked, is not very small, one can make accurate
predictions of the relative magnitudes of B and S, and of the ratio B/(B+S).

This may be stated in the form of an empirical theorem.

Empirical Theorem 1: Let k male members observed in any gathering have
a total number of B brothers (including themselves) and a total number of S
sisters. Then the following predictions can be made:

(i) B is much greater than S.

(ii) B - k is approximately equal to S.

(iii) B/(B+S) is larger than one half. It will be closer to
$\frac{1}{2} + \frac{k}{2(B+S)}$.

(iv) (B-k)/(B+S-k) is close to half.

The roles of B and S are reversed if the data are ascertained from the
female members in a gathering.

Consider a family with n children. Then on the assumption of a binomial
distribution with $\pi = 1/2$ and index n, the probability of r male children is

$$p(r) = \binom{n}{r}\left(\frac{1}{2}\right)^n, \quad r = 0,1,2,\ldots,n. \tag{5.1}$$

In our case, there is at least one male child so that the appropriate
distribution is a truncated one. One possibility is a truncated binomial
(TB),

$$p^T(r) = \frac{p(r)}{1-\left(\frac{1}{2}\right)^n}, \quad r = 1,2,\ldots,n, \tag{5.2}$$

and another is a size biased binomial (WB)

$$p^w(r) = \frac{r\,p(r)}{n/2}, \quad r = 1,2,\ldots,n. \tag{5.3}$$

In Rao (1977), it was argued that (5.3) is more appropriate for the observed
data than (5.2). Table 2 gives the observed frequency distributions of the
number of brothers in families of different sizes based on the data collected
separately from the male and female students in the universities at Shanghai
(China), Manila (Philippines), and Bombay (India), and the expected values
on the hypotheses of TB as in (5.2) and WB as in (5.3).


Table 2

Observed frequencies of the number of brothers in families of
different sizes and expected frequencies under the hypotheses
of TB and WB distributions.

(Data from male students in Shanghai, Manila and Bombay)

                   n = 1                      n = 2                      n = 3
No. of       observed   expected        observed   expected        observed   expected
brothers                TB     WB                  TB     WB                  TB     WB

   1            6        6      6          24     28.7   21.5         12     20.1   11.7
   2                                       19     14.3   21.5         24     20.1   23.5
   3                                                                  11      6.7   11.7

TOTAL           6        6      6          43     43.0   43.0         47     46.9   46.9

                   n = 4                      n = 5                      n = 6
No. of       observed   expected        observed   expected        observed   expected
brothers                TB     WB                  TB     WB                  TB     WB

   1            8      11.2    5.3          5      6.5    2.5          1      1.9    0.6
   2           10      16.8   15.7          8     12.9   10.0          4      4.8    3.1
   3           17      11.2   15.7         15     12.9   15.0          4      6.3    6.3
   4            7       2.8    5.3         10      6.5   10.0          9      4.8    6.3
   5                                        2      1.3    2.5          2      1.9    3.1
   6                                                                   0      0.3    0.6

TOTAL          42      42.0   42.0         40     40.1   40.0         20     20.0   20.0

It  is  seen  from  the  above  table  that  the  WB  (weighted  binomial)  provides  a 
better  fit  than  the  TB  (truncated  binomial)  indicating  that  a  family  with  r 
brothers  is  sampled  with  probability  proportional  to  r. 
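The expected frequencies in Table 2 follow directly from (5.2) and (5.3). A minimal sketch, assuming $\pi = 1/2$ as in the text and using the observed counts for n = 4 from Table 2 (the function names are hypothetical):

```python
from math import comb


def tb_pmf(r, n):
    """Truncated binomial (5.2) with pi = 1/2."""
    return comb(n, r) * 0.5 ** n / (1.0 - 0.5 ** n)


def wb_pmf(r, n):
    """Size-biased (weighted) binomial (5.3) with pi = 1/2."""
    return r * comb(n, r) * 0.5 ** n / (n / 2.0)


def expected(observed, pmf):
    """Expected frequencies for families of size n = len(observed)."""
    n, total = len(observed), sum(observed)
    return [total * pmf(r, n) for r in range(1, n + 1)]


# Observed counts from Table 2 for n = 4 (r = 1, 2, 3, 4 brothers).
obs_n4 = [8, 10, 17, 7]
print("TB:", expected(obs_n4, tb_pmf))
print("WB:", expected(obs_n4, wb_pmf))
```

Passing the observed column for any other family size reproduces the corresponding TB and WB columns up to rounding.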

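Accepting the hypothesis of the weighted (size biased) binomial, viz.,

$$p^w(r) = \binom{n-1}{r-1}\left(\frac{1}{2}\right)^{n-1}, \quad r = 1,2,\ldots,n, \tag{5.4}$$

we immediately find that

$$E(r \mid n) = \sum_{r=1}^{n} r\binom{n-1}{r-1}\left(\frac{1}{2}\right)^{n-1} = \frac{n+1}{2} \;\Rightarrow\; E(r-1) = \frac{n-1}{2}. \tag{5.5}$$

If $(r_1,n_1), \ldots, (r_k,n_k)$ are observed data with $B = r_1 + \cdots + r_k$,
$T = n_1 + \cdots + n_k$ and $S = T - B$, then for given T

$$E(B-k) = \sum_{i=1}^{k} E(r_i-1) = \sum_{i=1}^{k}\frac{n_i-1}{2} = \frac{T-k}{2} = E(S), \tag{5.6}$$

$$E\left(\frac{B}{B+S}\right) = \frac{1}{2} + \frac{k}{2(B+S)}. \tag{5.7}$$

Removing the expectation signs in (5.6) and (5.7), we can assert approximate
equalities as stated in Empirical Theorem 1.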

During  the  last  twenty  years,  while  lecturing  to  students  and  teachers 
in  different  parts  of  the  world,  I  collected  data  on  numbers  of  brothers  and 
sisters  in  the  family  of  each  individual  in  my  audience.  The  results  are 
summarized  in  Tables  3,  4  and  5.  It  is  seen  that  the  predictions  as  given 
in  the  empirical  theorem  are  true  in  practically  every  case.  As  a  further 
test of the weighted binomial, the statistic

$$\chi^2 = \frac{4\left([B-k] - [(T-k)/2]\right)^2}{T-k}, \tag{5.8}$$

which is asymptotically distributed as Chi-square on one degree of freedom,
is calculated in each case. The Chi-squares are all small, providing evidence
in favor of the weighted binomial distribution. [Actually, the Chi-squares
are too small, which calls for further study of the mechanism generating the
observed data.]
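For a single gathering, the two ratios of Empirical Theorem 1 and the statistic (5.8) can be computed as below. This is a sketch with a hypothetical function name `summary`; the input values k = 55, B = 180, S = 127 are those of the Bangalore row of Table 3.

```python
def summary(k, B, S):
    """Return B/(B+S), (B-k)/(B+S-k) and the chi-square statistic (5.8)."""
    T = B + S
    ratio = B / T                    # should exceed one half
    pi_hat = (B - k) / (T - k)       # should be close to one half
    chi2 = 4.0 * ((B - k) - (T - k) / 2.0) ** 2 / (T - k)
    return ratio, pi_hat, chi2


ratio, pi_hat, chi2 = summary(k=55, B=180, S=127)  # Bangalore row of Table 3
print(round(ratio, 3), round(pi_hat, 3), round(chi2, 2))
```

The chi-square value is to be referred to a Chi-square distribution on one degree of freedom, as stated in the text.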

The situation is slightly different in Table 5 relating to the data on
professors. The estimated proportion is more than half in each case, and
the Chi-square values are high; this implies that the weight function
appropriate to these data is of a higher order than r, the number of brothers.
Male professors seem to come from families where sons are disproportionately
more than the daughters!


Table 3. Data on Male Respondents (Students)*

Place and year             k      B      S     B/(B+S)   (B-k)/(B+S-k)†   $\chi^2$

Bangalore (India, 75)      55    180    127     .586         .496           .02
Delhi (India, 75)          29     92     66     .582         .490           .07
Calcutta (India, 63)      104    414    312     .570         .498           .04
Waltair (India, 69)        39    123     88     .583         .491           .09
Ahmedabad (India, 75)      29     84     49     .632         .523           .35
Tirupati (India, 75)      592   1902   1274     .599         .484           .50
Poona (India, 75)          47    125     65     .658         .545          1.18
Hyderabad (India, 74)      25     72     53     .576         .470           .36
Tehran (Iran, 75)          21     65     40     .619         .500           .19
Isphahan (Iran, 75)        11     45     32     .584         .515           .06
Tokyo (Japan, 75)          50     90     34     .725         .540           .49
Lima (Peru, 82)            38    132     87     .603         .519           .27
Shanghai (China, 82)       74    193    132     .594         .474           .67
Columbus (USA, 75)         29     65     52     .556         .409          2.91
College St. (USA, 76)      63    152     90     .628         .497           .01

Total                    1206   3734   2501     .600         .503          0.14

* k = number of students, B = total number of brothers including the respondent, S = total
number of sisters.

† Estimate of $\pi$ under size biased binomial distribution.


Table 4. Data on Female Respondents (Students)

Place and year                  k      B      S     S/(B+S)   (S-k)/(B+S-k)   $\chi^2$

Lima (Peru, 82)                 16     37     48     .565         .464           .36
Los Banos (Philippines, 83)     44    101    139     .579         .485           .18
Manila (Philippines, 83)        84    197    281     .588         .500           .00
Bilbao (Spain, 83)              14     19     35     .576         .525           .10
Shanghai (China, 82)            27     28     55     .662         .500           .00


Table 5. Data on Male Respondents (Professors)

Place and year                  k            B            S     B/(B+S)   (B-k)/(B+S-k)   $\chi^2$

State College (USA, 75)         28      [illegible]       37     .690         .584        [illegible]
Warsaw (Poland, 75)             18           41           21     .660         .525        [illegible]
Poznan (Poland, 75)             24           50           17     .746         .567          1.88
Pittsburgh (USA, 81)            69          169           77     .687         .565          2.99
Tirupati (India, 76)       [illegible]      172          132     .566         .480          0.39
Maracaibo (Venezuela, 82)       24           95           56     .629         .559          1.77
Richmond (USA, 81)              26           57           29     .663         .517          0.03

Total                          239          664          369     .642         .535          3.95


Note 1. From (5.7), the expected value of the ratio B/(B+S) for given
average family size f = (B+S)/k is as follows for different values of f:

f:             1     2     3     4      5     6

E[B/(B+S)]:    1    .75   .67   .625   .6    .58

These figures show that in any given situation where the average family size
is not likely to exceed 6, the following predictions can be made about the
total number of brothers (B) and of sisters (S) ascertained from the male
members in any gathering:

(i) B is much greater than S.

(ii) B/(B+S) is closer to 0.6 or even 2/3 rather than to 1/2.

Surprisingly, these predictions hold even if k, the number of males in a
gathering, is small. [This will be a good classroom exercise or a demonstration
piece in any gathering. One can make these predictions in advance and
demonstrate the accuracy of predictions after collecting the data from male (or
female) members.]
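The values tabulated in Note 1 follow from (5.7) on substituting B + S = kf. Written out as a short worked step, with f = 4 as an example:

```latex
E\left(\frac{B}{B+S}\right)
  = \frac{1}{2} + \frac{k}{2(B+S)}
  = \frac{1}{2} + \frac{k}{2kf}
  = \frac{1}{2} + \frac{1}{2f},
\qquad \text{e.g. } f = 4:\quad \frac{1}{2} + \frac{1}{8} = 0.625 .
```

The expected ratio therefore depends on the data only through the average family size f and decreases towards 1/2 as f grows.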


Note 2. The probabilities for B > S, B = S, B < S in the case of a weighted
binomial distribution are given in Table 6 for different family sizes n.

Table 6. Probabilities of B > S, B = S and B < S

n         1     2     3     4      5       6       7        8         9         10

B > S     1    1/2   3/4   1/2   11/16    1/2    42/64     1/2     163/256     1/2
B = S     0    1/2    0    3/8     0     10/32     0     35/128       0      126/512
B < S     0     0    1/4   1/8    5/16    6/32   22/64   29/128    93/256    130/512


It is seen that P(B>S) is much larger than P(B<S) for each n so that in any
given audience, the ratio of $b_g$ (males belonging to families with B > S) to
$b_\ell$ (those with B < S) is likely to be large, depending on the distribution of
family sizes. We may now state another empirical theorem.

Empirical Theorem 2. The numbers $b_g$ and $b_\ell$ are approximately in the
ratio of

$$E(b_g) = p_1 + \frac{3}{4}p_3 + \frac{11}{16}p_5 + \cdots + \frac{1}{2}(p_2 + p_4 + \cdots) \tag{5.9}$$

to

$$E(b_\ell) = \frac{1}{4}p_3 + \frac{1}{8}p_4 + \cdots, \tag{5.10}$$

where $p_n$ is the number of families with n children. In western audiences
where the expected family size is small, the ratio $b_g : b_\ell$ is likely to be
even larger than 4 : 1, and in oriental audiences larger than 2 : 1, which are
quite high compared to 1 : 1. [This phenomenon can be predicted and verified
by asking the members of an audience to indicate by show of hands how many
belong to the category B > S and how many to B < S. This will be a good
classroom exercise or a demonstration piece in any gathering.]
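The probabilities of Table 6 and the expected counts in (5.9) and (5.10) can be generated from the weighted binomial (5.4). The sketch below uses exact fractions; the function names and the audience composition `sizes` are hypothetical and are not data from the paper.

```python
from fractions import Fraction
from math import comb


def table6(n):
    """P(B > S), P(B = S), P(B < S) for a family of size n under the weighted binomial (5.4)."""
    greater = equal = less = Fraction(0)
    for r in range(1, n + 1):  # r brothers, n - r sisters
        p = Fraction(comb(n - 1, r - 1), 2 ** (n - 1))
        if 2 * r > n:
            greater += p
        elif 2 * r == n:
            equal += p
        else:
            less += p
    return greater, equal, less


def expected_counts(family_sizes):
    """E(b_g) and E(b_l) as in (5.9) and (5.10); family_sizes maps n to the count p_n."""
    e_g = sum(c * table6(n)[0] for n, c in family_sizes.items())
    e_l = sum(c * table6(n)[2] for n, c in family_sizes.items())
    return e_g, e_l


sizes = {1: 5, 2: 20, 3: 15, 4: 8, 5: 2}  # hypothetical audience, not data from the paper
e_g, e_l = expected_counts(sizes)
print(e_g, e_l, float(e_g / e_l))
```

Changing `sizes` shows how the predicted ratio $b_g : b_\ell$ depends on the distribution of family sizes in the audience.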

Note 3. Let p(b,n) be the probability that a family is of size N = n and the
number of brothers B = b, and suppose that the probability of selecting such
a family is proportional to b. Then

\[
  p^w(b,n) = \frac{b\,p(b,n)}{E(B)} = \frac{b\,p(n)\,p(b \mid n)}{E(B)}, \tag{5.11}
\]

\[
  p^w(n) = \frac{E(B \mid n)}{E(B)}\,p(n). \tag{5.12}
\]

When p(b|n) is binomial,

\[
  p^w(n) = \frac{n\,p(n)}{E(N)}, \qquad
  E\!\left(\frac{1}{N^w}\right) = \frac{1}{E(N)}, \tag{5.13}
\]

so that the harmonic mean of observations $n_1, \ldots, n_k$ on $N^w$, i.e.,
from the distribution (5.11) or (5.12),

\[
  \frac{k}{\sum_{i=1}^{k} 1/n_i}, \tag{5.14}
\]

is an estimate of E(N) in the original population. If the form of p(n) is
known, then one could write down the likelihood of the sample $n_1, \ldots, n_k$
using the probability function (5.12) and determine the unknown parameters
by the method of maximum likelihood.
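
The identity behind the harmonic mean estimate can be checked exactly on a small example. The sketch below uses a hypothetical family-size distribution p(n), chosen only for illustration, forms the size-biased version n p(n)/E(N), and compares E(1/N^w) with 1/E(N); the helper names are not from the report.

```python
from fractions import Fraction


def size_biased(p):
    """Return p^w(n) = n p(n) / E(N) for a family-size distribution p."""
    mean = sum(n * pn for n, pn in p.items())
    return {n: n * pn / mean for n, pn in p.items()}


def harmonic_mean(values):
    """Harmonic mean k / sum(1 / n_i), used as the estimate of E(N)."""
    return len(values) / sum(Fraction(1, v) for v in values)


# Hypothetical family-size distribution, chosen only for illustration.
p = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}
pw = size_biased(p)

mean_n = sum(n * pn for n, pn in p.items())             # E(N)
mean_inverse_w = sum(pn / n for n, pn in pw.items())    # E(1 / N^w)
print(mean_n, mean_inverse_w)
print(harmonic_mean([1, 2, 2, 3]))
```

The arithmetic mean of size-biased observations would instead estimate E(N^2)/E(N), which exceeds E(N) unless all families have the same size.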

6.  ALCOHOLISM,  FAMILY  SIZE,  AND  BIRTH  ORDER 

Smart  (1963,  1964)  and  Sprott  (1964)  examined  a  number  of  hypotheses  on 
the  incidence  of  alcoholism  in  Canadian  families  using  the  data  on  family 
size  and  birth  order  of  242  alcoholics  admitted  to  three  alcoholism  clinics 
in  Ontario.  The  method  of  sampling  is  thus  of  the  type  discussed  in  Section  5. 

One of the hypotheses tested was that larger families contain larger
numbers of alcoholics than expected. The null hypothesis that the number of
alcoholics is as expected was interpreted to imply that the observations on
family size as ascertained arise from the weighted distribution

\[
  n\,p(n)/E(n), \quad n = 1, 2, \ldots, \tag{6.1}
\]

where  p(n),  n  =  1,2,...,  is  the  distribution  of  family  size  in  the  general 
population. Smart and Sprott used the distribution of family size as reported
in the 1931 census of Ontario for p(n) in their analysis. It is then a
simple  matter  to  test  whether  the  observed  distribution  of  family  size  in 
their  study  is  in  accordance  with  the  expected  distribution  (6.1). 

It may be noted that the distribution (6.1) would be appropriate if we
had chosen individuals (alcoholic or not) at random from the general population
(of individuals) and ascertained the sizes of the families to which they
belonged. But it is not clear whether the same distribution (6.1) holds if
the inquiry is restricted to alcoholic individuals admitted to a clinic, as
assumed by Smart and Sprott. This could happen, as demonstrated below, under
an interpretation of their null hypothesis that the number of alcoholics in
a family has a binomial distribution (like failures in a sequence of independent
trials), and a further assumption that every alcoholic has the same independent
chance of being admitted to a clinic.

Let $\pi$ be the probability of an individual becoming an alcoholic, and
suppose that the probability that a member of a family becomes an alcoholic is
independent of whether another member is alcoholic or not. Further let p(n),
n = 1, 2, ..., be the probability distribution of family size (whether a family
has an alcoholic or not) in the general population. Then the probability that
a family is of size n and has r alcoholics is

\[
  p(n)\binom{n}{r}\pi^r\phi^{\,n-r}, \quad r = 0, \ldots, n, \quad n = 1, 2, \ldots, \tag{6.2}
\]

where $\phi = 1 - \pi$. From (6.2), it follows that the distribution of family size
in the general population, given that a family has at least one alcoholic, is

\[
  \frac{(1 - \phi^n)\,p(n)}{1 - E(\phi^n)}, \quad n = 1, 2, \ldots. \tag{6.3}
\]


If we had chosen households at random and recorded the family sizes in
households containing at least one alcoholic, then the null hypothesis on the
excess of alcoholics in larger families could be tested by comparing the observed
frequencies with the expected frequencies under the model (6.3). However,
under the sampling scheme adopted of ascertaining the values of n and r from an
alcoholic admitted to a clinic, the weighted distribution of (n,r),

\[
  p^w(n,r) = \frac{r\,p(n)\binom{n}{r}\pi^r\phi^{\,n-r}}{\pi\,E(n)}
           = \frac{n\,p(n)}{E(n)}\binom{n-1}{r-1}\pi^{r-1}\phi^{\,n-r},
  \quad r = 1, \ldots, n, \quad n = 1, 2, \ldots, \tag{6.4}
\]

is more appropriate. If we had information on the family size n as well as on
the number of alcoholics (r) in the family, we could have compared the observed
joint frequencies of (n,r) with those expected under the model (6.4).


From (6.4), the marginal distribution of n alone is

\[
  n\,p(n)/E(n), \quad n = 1, 2, \ldots, \tag{6.5}
\]

which is used by Smart and Sprott as a model for the observed frequencies of
family sizes. It is shown in (6.3) that in the general population, the
distribution of family size in families with at least one alcoholic is

\[
  \frac{(1 - \phi^n)\,p(n)}{1 - E(\phi^n)},
\]

which reduces to (6.5) if $\phi$ is close to unity. In other words, if the
probability of an individual becoming an alcoholic is small, then the distribution
of family size as ascertained is close to the distribution of family size in
families with at least one alcoholic in the general population. This is not
true if $\phi$ is not close to unity.
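
How close the two distributions are can be seen numerically. The following sketch compares the distribution of family size given at least one alcoholic, (6.3), with the size-biased distribution (6.5) for a hypothetical p(n) and several values of the probability of becoming an alcoholic; the distribution and the function names are illustrative assumptions.

```python
def at_least_one_alcoholic(p, pi):
    """Family-size distribution (6.3), given at least one alcoholic."""
    phi = 1.0 - pi
    weights = {n: (1.0 - phi ** n) * pn for n, pn in p.items()}
    total = sum(weights.values())          # equals 1 - E(phi^n)
    return {n: w / total for n, w in weights.items()}


def size_biased(p):
    """Size-biased family-size distribution (6.5): n p(n) / E(n)."""
    mean = sum(n * pn for n, pn in p.items())
    return {n: n * pn / mean for n, pn in p.items()}


def max_gap(p, pi):
    """Largest absolute difference between (6.3) and (6.5)."""
    a, b = at_least_one_alcoholic(p, pi), size_biased(p)
    return max(abs(a[n] - b[n]) for n in p)


# Hypothetical family-size distribution, for illustration only.
p = {1: 0.2, 2: 0.3, 3: 0.3, 4: 0.2}
for pi in (0.5, 0.1, 0.01, 0.001):
    print(pi, round(max_gap(p, pi), 5))
```

The gap shrinks as the probability of becoming an alcoholic decreases, in line with the limiting argument in the text.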

Smart  and  Sprott  found  that  the  distribution  (6.5)  did  not  fit  the  observed 
frequencies, which had heavier tails. They concluded that larger families
contribute  more  than  their  expected  share  of  alcoholics.  Is  this  a  valid 
conclusion?  It  is  seen  that  the  weighted  distribution  (6.5)  is  derived  under 
two  hypotheses.  One  is  that  the  distribution  of  family  size  in  the  subset  of 
families  having  at  least  one  alcoholic  in  the  general  population  is  of  the 
form  (6.3)  which  is  implied  by  the  original  null  hypothesis  posed  by  Smart. 

The  other  is  that  the  method  of  ascertainment  is  equivalent  to  p.p.s.  sampling 
of  families,  with  probability  proportional  to  the  number  of  alcoholics  in  a 
family.  The  rejection  of  (6.5)  would  imply  the  rejection  of  the  first  of 
these  two  hypotheses  if  the  second  is  assumed  to  be  correct.  There  are  no 
a  priori  grounds  for  such  an  assumption,  and  in  the  absence  of  an  objective 
test  for  this,  some  caution  is  needed  in  accepting  Smart's  conclusions. 

Another hypothesis considered by Smart was that the later-born children
have a greater tendency to become alcoholic than the earlier-born. The
method used by Smart may be somewhat confusing to statisticians. Some comments
were made by Sprott criticizing Smart's approach. We shall review Smart's
analysis in the light of the model (6.4). If we assume that birth order has
no relationship to becoming an alcoholic, and the probability of an alcoholic
being referred to a clinic is independent of the birth order, then the
probability that an observed alcoholic belongs to a family with n children and r
alcoholics and has given birth order $s \le n$ is, using the model (6.4),

\[
  \frac{p(n)}{E(n)}\binom{n-1}{r-1}\pi^{r-1}\phi^{\,n-r},
  \quad s = 1, \ldots, n, \quad r = 1, \ldots, n, \quad n = 1, 2, \ldots. \tag{6.6}
\]

Summing over r, we find that the marginal distribution of (n,s), the family
size and birth order, applicable to the observed distribution, is

\[
  p(n)/E(n), \quad s = 1, \ldots, n, \quad n = 1, 2, \ldots, \tag{6.7}
\]

where it may be recalled that p(n), n = 1, 2, ..., is the distribution of family
size in the general population. Smart gave the observed bivariate frequencies
of (n,s), and since p(n) was known, the expected values could have
been computed and compared with the observed. But he did something else.

From (6.7), the marginal distribution of birth rank is

\[
  \frac{1}{E(n)}\sum_{i=r}^{\infty} p(i), \quad r = 1, 2, \ldots. \tag{6.8}
\]

Smart's (1963) analysis in his Table 2 is an attempt to compare the observed
distribution of birth ranks with the expected under the model (6.8) with p(i)
itself estimated from data using the model (6.1).

A  better  method  is  as  follows:  from  (6.7)  it  is  seen  that  for  given  family 
size,  the  expected  birth  order  frequencies  are  equal  as  computed  by  Smart  (1963) 
in  Table  1,  in  which  case  individual  Chi-squares  comparing  the  expected  and 
observed  frequencies  for  each  family  size  would  provide  all  the  information 
about  the  hypothesis  under  test.  Such  a  procedure  would  be  independent  of  any 
knowledge  of  p(n).  But  it  is  not  clear  whether  a  hypothesis  of  the  type  posed 
by Smart can be tested on the basis of the available data without further
information on the other alcoholics in the family, such as their age, sex, etc.
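
The per-family-size comparison can be sketched as follows. For a given family size n the expected frequencies of the n birth ranks are equal, so the usual Chi-square statistic is computed against the average count. The counts below are illustrative inputs for family sizes 2 and 3, and the function name is an assumption of this sketch.

```python
def equal_frequency_chi_square(observed):
    """Chi-square statistic against equal expected birth-rank frequencies.

    `observed` lists the counts for birth ranks 1..n within one family size n;
    the statistic has n - 1 degrees of freedom under the hypothesis.
    """
    total = sum(observed)
    expected = total / len(observed)
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, len(observed) - 1


# Illustrative counts for family sizes 2 and 3 (birth ranks in order).
for counts in ([22, 10], [17, 14, 9]):
    stat, df = equal_frequency_chi_square(counts)
    print(counts, round(stat, 2), df)
```

Each statistic is referred to the Chi-square distribution with the stated degrees of freedom; no knowledge of p(n) enters the calculation.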

Table 6 reproduces a portion of Table 1 in Smart (1963) relating to families
up to size 4 and birth ranks up to 4. It is seen that for family sizes
2 and 3, the observed frequencies seem to contradict the hypothesis, and for
family sizes above 3 (see Smart's Table 1), birth rank does not have any
effect. It is interesting to compare the above data with a similar type of
data (Table 7) collected by the author on birth rank and family size of the
staff members in two departments at the University of Pittsburgh. It appears
that there are too many earlier-borns among the staff members, indicating
that becoming a professor is an affliction of the earlier born! It is expected
that in data of the kind we are considering there will be an excess
of the earlier born without implying an implicit relationship between birth
order and a particular attribute, especially when it is age dependent.


Table 6. Distribution of Birth Rank s and Family Size n*

          n = 1         n = 2         n = 3         n = 4
   s      O     E       O     E       O     E       O     E
   1      21    21      22    16      17    13.3    11    ...
   2                    10    16      14    13.3    10    ...
   3                                  9     13.3    13    ...
   4                                                13    ...

* O = observed, E = expected.

Table 7. Distribution of Birth Rank s and Family Size n ≤ 4 Among Staff
Members (University of Pittsburgh)

   s      n = 1    n = 2    n = 3    n = 4
   1      7        14       9        6
   2               6        4        2
   3                        2        0
   4                                 0

7.  WAITING  TIME  PARADOX 

Patil (1984) reported a study conducted in 1966 by the Institute National
de la Statistique et de l'Economie Appliquee in Morocco to estimate the mean
sojourn time of tourists. Two types of surveys were conducted, one by contacting
tourists residing in hotels and another by contacting tourists at frontier
stations while leaving the country. The mean sojourn time as reported by 3,000
tourists in hotels was 17.8 days, and by 12,321 tourists at frontier stations
was 9.0 days. The officials in the department of planning were suspicious of
the estimate from the hotels, and it was discarded.

It is clear that the observations collected from tourists while leaving
the country correspond to the true distribution of sojourn time, so that the
observed average 9.0 is a valid estimate of the mean sojourn time. It can be
shown that in a steady state of flow of tourists, the sojourn time as reported
by those contacted at hotels has a size biased distribution, so that the
observed average will be an overestimate of the mean sojourn time. If $X^w$
is a size biased random variable (r.v.), then

\[
  E\!\left(\frac{1}{X^w}\right) = \frac{1}{\mu}, \tag{7.1}
\]

where $\mu$ is the expected value of X, the original variable. The formula (7.1)
shows that the harmonic mean of the size biased observations is a valid
estimate of $\mu$. Thus the harmonic mean of the observations from the tourists in
hotels would have provided an estimate comparable with the arithmetic mean
of the observations from the tourists at the frontier stations.

It is interesting to note that the estimate from hotel residents is
nearly twice the other, a factor which occurs in the waiting time paradox
(see Feller, 1966; Patil and Rao, 1977) associated with the exponential
distribution. This suggests, but does not confirm, that sojourn time
distribution may be exponential.
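
The factor of two and the role of the harmonic mean can be illustrated by simulation. The sketch below assumes that the true sojourn time is exponential with mean 9 days (a hypothetical choice suggested by the frontier average) and uses the fact that the size biased version of an exponential with mean mu is a gamma distribution with shape 2 and scale mu. The function names are illustrative.

```python
import random


def simulate_hotel_survey(mu, n, seed=0):
    """Draw n size-biased sojourn times when the true law is exponential(mu).

    Assumption: the size-biased exponential with mean mu is drawn directly
    as a gamma variable with shape 2 and scale mu.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(2.0, mu) for _ in range(n)]


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def harmonic_mean(xs):
    return len(xs) / sum(1.0 / x for x in xs)


stays = simulate_hotel_survey(mu=9.0, n=200_000)
print(round(arithmetic_mean(stays), 2))   # near 2 * mu
print(round(harmonic_mean(stays), 2))     # near mu
```

The harmonic mean converges slowly here because 1/X^w has a heavy upper tail, so a large sample is used in the sketch.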

Suppose that the tourists at hotels were asked how long they had been
staying in the country up to the time of inquiry. In such a case, we may
assume that the p.d.f. of the r.v. Y, the time a tourist has been in a country
up to the time of inquiry, is the same as that of the product $X^w R$, where $X^w$
is the size biased version of X, the sojourn time, and R is an independent r.v.
with a uniform distribution on [0,1]. If F(x) is the distribution function
of X, then the p.d.f. of Y is

\[
  \mu^{-1}\,[1 - F(y)]. \tag{7.2}
\]

The parameter $\mu$ can be estimated on the basis of observations on Y, provided
the functional form of F(y), the distribution of the sojourn time, is known.
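
The form of (7.2) can be checked in one line. The sketch assumes that X has p.d.f. f(x) with mean $\mu$, so that the size biased $X^w$ has p.d.f. $x f(x)/\mu$, and that, given $X^w = x$, the product $Y = X^w R$ is uniform on [0, x]; the symbol $p_Y$ is introduced here only for the derivation.

```latex
\[
  p_Y(y) = \int_y^\infty \frac{1}{x}\cdot\frac{x\,f(x)}{\mu}\,dx
         = \frac{1}{\mu}\int_y^\infty f(x)\,dx
         = \frac{1 - F(y)}{\mu}, \qquad y \ge 0 .
\]
```

For an exponential F with mean $\mu$ this gives $\mu^{-1}e^{-y/\mu}$, so that the time already spent has the same distribution as the complete sojourn time.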

It  is  interesting  to  note  that  the  p.d.f.  (7.2)  is  the  same  as  that 
obtained  by  Cox  (1962)  in  studying  the  distribution  of  failure  times  of  a 
component  used  in  different  machines  from  observations  of  the  ages  of  the 
components in use at the time of investigation.


8.  DAMAGE  MODELS 


Let N be a r.v. with probability distribution $p_n$, n = 1, 2, ..., and R
be a r.v. such that

\[
  P(R = r \mid N = n) = s(r,n). \tag{8.1}
\]

Then the marginal distribution of R truncated at zero is

\[
  p'_r = (1 - p)^{-1}\sum_{n=r}^{\infty} p_n\,s(r,n), \quad r = 1, 2, \ldots, \tag{8.2}
\]

where

\[
  p = \sum_{i=1}^{\infty} p_i\,s(0,i). \tag{8.3}
\]

The observation r represents the number surviving when the original observation
n is subject to a destructive process which reduces n to r with probability
s(r,n). Such a situation arises when we consider observations on family
size counting only the surviving children (R). The problem is to determine
the distribution of N, the original family size, knowing the distribution of R
and assuming a suitable survival distribution.

Suppose that $N \sim P(\lambda)$, i.e., distributed as Poisson with parameter
$\lambda$, and let $R \sim B(\cdot,\pi)$, i.e., binomial with parameter $\pi$. Then

\[
  p'_r = \frac{e^{-\lambda\pi}(\lambda\pi)^r}{r!\,(1 - e^{-\lambda\pi})},
  \quad r = 1, 2, \ldots. \tag{8.4}
\]

It is seen that the parameters $\lambda$ and $\pi$ get confounded, so that knowing the
distribution of R, we cannot find the distribution of N. Similar confounding
occurs when N follows a binomial, negative binomial, or logarithmic series
distribution. When the survival distribution is binomial, Sprott (1965) gives
a general class of distributions which has this property. What additional
information is needed to recover the original distribution? For instance,
if we know which of the observations in the sample did not suffer damage,
then it is possible to estimate the original distribution as well as the
binomial parameter $\pi$.

It is interesting to note that observations which do not suffer any
damage have the distribution

\[
  p^u_r = c\,p_r\,\pi^r, \tag{8.5}
\]

which is a weighted distribution. If the original distribution is Poisson,

\[
  p^u_r = \frac{e^{-\lambda\pi}(\lambda\pi)^r}{r!\,(1 - e^{-\lambda\pi})}, \tag{8.6}
\]

which is the same as (8.4). It is shown in Rao and Rubin (1964) that the
equality $p^u_r = p'_r$ characterizes the Poisson distribution.
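
The equality of the two distributions in the Poisson case can be verified numerically. The sketch below computes the zero-truncated distribution of the survivors by direct summation, the distribution of the undamaged observations from its definition, and the closed form; the parameter values and function names are hypothetical, and the Poisson sum is truncated at a large n.

```python
from math import comb, exp, factorial


def poisson_pmf(n, lam):
    return exp(-lam) * lam ** n / factorial(n)


def survivors_truncated(r, lam, pi, n_max=120):
    """p'_r: Poisson N, binomial survival, R truncated at zero."""
    def marginal(k):
        return sum(poisson_pmf(n, lam) * comb(n, k) * pi ** k * (1 - pi) ** (n - k)
                   for n in range(k, n_max + 1))
    return marginal(r) / (1.0 - marginal(0))


def undamaged(r, lam, pi, n_max=120):
    """p^u_r: distribution of observations with R = N = r >= 1."""
    norm = sum(poisson_pmf(n, lam) * pi ** n for n in range(1, n_max + 1))
    return poisson_pmf(r, lam) * pi ** r / norm


def closed_form(r, lam, pi):
    """Zero-truncated Poisson with parameter lam * pi."""
    m = lam * pi
    return exp(-m) * m ** r / (factorial(r) * (1.0 - exp(-m)))


lam, pi = 3.0, 0.6          # hypothetical parameter values
for r in range(1, 6):
    print(r, survivors_truncated(r, lam, pi), undamaged(r, lam, pi),
          closed_form(r, lam, pi))
```

Only the product of the two parameters enters the closed form, which is the confounding noted in the text: (3.0, 0.6) and (6.0, 0.3) give the same distribution of R.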

The damage models of the type described above were introduced in Rao
(1965). For theoretical developments on damage models and characterization
of probability distributions arising out of their study, the reader is
referred to Alzaid, Rao and Shanbhag (1984).


9.  QUADRAT  SAMPLING  WITH  VISIBILITY  BIAS 

For  the  purpose  of  estimating  wildlife  population  density,  quadrat 
sampling  has  been  found  generally  preferable.  Quadrat  sampling  is  carried 
out  by  first  selecting  at  random  a  number  of  quadrats  of  fixed  size  from 
the  region  under  study  and  ascertaining  the  number  of  animals  in  each. 
Following Cook and Martin (1974) we make the assumptions as given below:

$A_1$: Animals occur in groups within each quadrat and the number of groups
within a quadrat has a specified distribution.

$A_2$: The number of animals in a group has a specified distribution.

$A_3$: The number of groups within a quadrat and the number of animals
within the groups are all independently distributed.

$A_4$: The method of sampling is such that the probability of sighting
(recording) a group of x animals is w(x).

Let X and $X^w$ be the r.v.'s representing the number of animals in a group
in the population and as ascertained. Similarly, let N and $N^w$ be the r.v.'s
for the number of groups within a quadrat. It is clear that since the method
of ascertainment does not give equal chance of selection to groups of all
sizes (unless w(x) is constant), the r.v.'s X and $X^w$ do not have the same
distribution, and so is the case with N and $N^w$. The following theorem
provides the basic results in quadrat sampling theory.

THEOREM. Under the assumptions $A_1$-$A_4$ we have the following results.

(i)
\[
  P(N^w = m \mid N = n) = \binom{n}{m}\omega^m(1-\omega)^{n-m},
\]
where
\[
  \omega = \sum_{x=1}^{\infty} w(x)\,P(X = x)
\]
is the visibility factor (the probability of recording a group).

(ii)
\[
  P(N^w = m) = \sum_{n=m}^{\infty}\binom{n}{m}\omega^m(1-\omega)^{n-m}P(N = n),
\]
i.e., the visibility bias induces an additive damage model on the true quadrat
frequency with binomial survival distribution (see Rao 1965).

(iii) The probability that m observed groups in a quadrat have $x_1, \ldots, x_m$
animals is
\[
  P(X_1^w = x_1, \ldots, X_m^w = x_m \mid N^w = m) = \prod_{i=1}^{m} P(X^w = x_i),
\]
where, it may be noted,
\[
  P(X^w = x) = w(x)\,P(X = x)/\omega.
\]

(iv) Let $S^w = X_1^w + \cdots + X_m^w$. Then
\[
  P(S^w = y) = \sum_{m=1}^{\infty} P(N^w = m)\,P(S^w = y \mid m),
\]
\[
  P(S^w = y \mid m) = \sum_{\sum x_i = y}
    \frac{w(x_1)\cdots w(x_m)}{\omega^m}\,P(X_1 = x_1)\cdots P(X_m = x_m).
\]

Proof. Under the assumptions and notations used, we have the basic
probability equation

\[
  P(N = n,\ N^w = m,\ X_1^w = x_1, \ldots, X_m^w = x_m,\ X_{m+1} = x_{m+1}, \ldots, X_n = x_n)
\]
\[
  = P(N = n)\binom{n}{m}\prod_{j=1}^{m} P(X_j = x_j)\,w(x_j)
    \prod_{j=m+1}^{n} [1 - w(x_j)]\,P(X_j = x_j). \tag{9.1}
\]

From (9.1), summing out $X_{m+1}, \ldots, X_n$, we have

\[
  P(N = n,\ N^w = m,\ X_1^w = x_1, \ldots, X_m^w = x_m)
  = P(N = n)\binom{n}{m}\omega^m(1-\omega)^{n-m}\prod_{j=1}^{m} P(X_j^w = x_j). \tag{9.2}
\]

Then the results (i), (ii) and (iii) of the theorem follow from (9.2). Summing
(9.2) over n from m to $\infty$, we have

\[
  P(N^w = m,\ X_1^w = x_1, \ldots, X_m^w = x_m)
  = P(N^w = m)\prod_{j=1}^{m} P(X_j^w = x_j), \tag{9.3}
\]

from which the result (iv) follows.
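
Results (i) to (iii) translate directly into a small computation. The sketch below takes hypothetical distributions for the group size X and the number of groups N, together with the weight function w(x) = beta^x, and returns the visibility factor, the distribution of observed group sizes and the distribution of the observed number of groups. All inputs and function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def visibility_factor(px, w):
    """omega = sum_x w(x) P(X = x), the chance that a group is recorded."""
    return sum(w(x) * prob for x, prob in px.items())


def observed_group_size(px, w):
    """Result (iii): P(X^w = x) = w(x) P(X = x) / omega."""
    omega = visibility_factor(px, w)
    return {x: w(x) * prob / omega for x, prob in px.items()}


def observed_group_count(pn, omega):
    """Result (ii): binomial thinning of the number of groups per quadrat."""
    out = {}
    for m in range(max(pn) + 1):
        out[m] = sum(comb(n, m) * omega ** m * (1 - omega) ** (n - m) * prob
                     for n, prob in pn.items() if n >= m)
    return out


# Hypothetical inputs: group sizes, groups per quadrat, and w(x) = beta^x.
px = {1: Fraction(1, 2), 2: Fraction(3, 10), 3: Fraction(1, 5)}
pn = {1: Fraction(1, 2), 2: Fraction(1, 2)}
beta = Fraction(1, 2)

omega = visibility_factor(px, lambda x: beta ** x)
print(omega)
print(observed_group_size(px, lambda x: beta ** x))
print(observed_group_count(pn, omega))
```

With this weight function smaller groups are over-represented among the sighted groups, and the observed number of groups is shifted towards zero.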

Note 1. The expression (9.3) enables us to write down the joint likelihood
of the numbers of groups observed in different quadrats and the numbers of
animals observed in all the groups sighted. Thus, if $m_1, \ldots, m_k$ are the
numbers of groups in k quadrats and $x_{ij}$ is the number of animals in the j-th
group of the i-th quadrat, the joint likelihood is the product of

\[
  \prod_{i=1}^{k} P(N^w = m_i) \tag{9.4}
\]

and

\[
  \prod_{i=1}^{k}\prod_{j=1}^{m_i} P(X^w = x_{ij}). \tag{9.5}
\]

Results (ii) and (iii) of the theorem give the methods of computing the
individual terms in (9.4) and (9.5) from the population distributions of N and
X and the weight function w(x). In general, the unknown parameters are those
occurring in the specified distributions of N and X and the additional
visibility factor $\omega$ (or $\beta$, the probability of sighting an animal). All these
could be estimated using the product of (9.4) and (9.5) as the likelihood
function.

Note 2. Cook and Martin (1974) consider the special case where

\[
  N \sim \mathrm{Po}(\lambda), \ \text{Poisson with parameter } \lambda, \tag{9.6}
\]


from the likelihood (9.4) and $\omega$, $\theta$ from (9.5). Cook and Martin (1974)
provided the necessary computations in such a case, choosing w(x) as in
(9.8).

If N is not a Poisson variable, then the distribution of $N^w$ involves
$\omega$ as an additional parameter (see Rao (1965) and Sprott (1965)), in which case
the product of (9.4) and (9.5) provides the joint likelihood for the estimation
of all the unknown parameters.

In the special case where N and X are as distributed in (9.6) and (9.7)
respectively and $w(x) = \beta^x$ (i.e., when a group is observed if and only if
all the animals are sighted),

\[
  N^w \sim \mathrm{Po}(\delta), \ \delta = \lambda\omega, \quad \text{and} \quad
  X^w \sim a_x\phi^x/g(\phi), \ \phi = \theta\beta,
\]

so that the parameters $\lambda$, $\theta$ and $\beta$ are confounded and are not individually
estimable.
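
The confounding can be made explicit under one assumption that is not visible in the text above: that the group size follows a power series distribution $P(X = x) = a_x\theta^x/g(\theta)$ with $g(\theta) = \sum_x a_x\theta^x$. With $w(x) = \beta^x$ a short calculation then gives the visibility factor and the distribution of the observed group size:

```latex
\[
  \omega = \sum_{x \ge 1} \beta^x\,\frac{a_x\theta^x}{g(\theta)}
         = \frac{g(\theta\beta)}{g(\theta)}, \qquad
  P(X^w = x) = \frac{\beta^x a_x\theta^x / g(\theta)}{\omega}
             = \frac{a_x(\theta\beta)^x}{g(\theta\beta)} .
\]
```

Under this assumption the observed data depend on the three parameters only through the two quantities $\delta = \lambda\, g(\theta\beta)/g(\theta)$ and $\phi = \theta\beta$, which is why the parameters cannot be separated.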


10.  THE  STORY  OF  BROKEN  BONES 

The  following  problem  arose  in  the  analysis  of  measurements  on  femur 
bones  recovered  from  an  ancient  graveyard.  When  a  femur  bone  was  found 
intact  it  was  possible  to  take  three  measurements,  length  L,  breadth  of  the 
top  tip  B  and  breadth  of  the  bottom  tip  T.  But  when  a  broken  piece  was 
found,  either  the  measurement  B  or  the  measurement  T  could  be  taken.  Thus, 
the  observed  data  was  incomplete  with  either  the  measurement  B  alone  or  T 
alone  on  some  and  all  three  L,  B,  T  on  others.  How  does  one  estimate  from 
the  fragmentary  data  of  the  above  type  the  mean  values  and  second  order 
moments  of  L,  B,  T  in  the  original  population  of  femur  bones? 

Let $p(\ell,b,t)$ be the p.d.f. of L, B, T in the original population with
the associated marginal densities

\[
  p(b) = \iint p(\ell,b,t)\,d\ell\,dt \quad \text{and} \quad
  p(t) = \iint p(\ell,b,t)\,d\ell\,db. \tag{10.1}
\]

If the probability that a bone gets broken does not depend on its dimensions,
then the likelihood of the observed data could be written down using the
p.d.f.'s $p(\ell,b,t)$, p(b) and p(t), depending on the available measurements
on each specimen. However, it may happen that the longer bones have a greater
chance of being broken; such a phenomenon was demonstrated in a similar
situation on skull measurements by Rao and Shaw (1948) and Rao (1978). In such
a case we may have to distinguish the measurements $L^s$, $B^s$, $T^s$ taken on well
preserved (surviving) bones and measurements $L^d$, $B^d$, $T^d$ associated with the
damaged bones and denote their p.d.f.'s with superfixes s and d respectively.

We suppose that the chance of survival of a femur bone of length $\ell$ is
$s(\ell)$, depending only on $\ell$. Then

\[
  p^s(\ell,b,t) = \sigma^{-1}\,p(\ell,b,t)\,s(\ell), \quad \sigma = E[s(L)]. \tag{10.2}
\]

Similarly

\[
  p^d(\ell,b,t) = (1-\sigma)^{-1}\,p(\ell,b,t)\,(1 - s(\ell)). \tag{10.3}
\]

From (10.2) and (10.3), the following are immediately deduced:

\[
  p^s(\ell) = \sigma^{-1}\,p(\ell)\,s(\ell),
\]
\[
  p^s(b,t \mid \ell) = p(b,t \mid \ell),
\]
\[
  p^s(b,t) = \sigma^{-1}\int p(b,t \mid \ell)\,p(\ell)\,s(\ell)\,d\ell
           = \sigma^{-1}\,p(b,t)\int p(\ell \mid b,t)\,s(\ell)\,d\ell
           = p(b,t)\,w(b,t),
\]
\[
  p^d(b,t \mid \ell) = p(b,t \mid \ell),
\]
\[
  p^{s \text{ or } d}(b) = p(b) \quad \text{and} \quad p^{s \text{ or } d}(t) = p(t),
\]
\[
  p^d(\ell \mid b,t) = \frac{p(\ell,b,t)\,(1 - s(\ell))}{\int p(\ell,b,t)\,(1 - s(\ell))\,d\ell}
                     \ne p(\ell \mid b,t).
\]

It is interesting to note that all distributions involving L as a main
variate are weighted. One casualty of this result is that the regression
of L on (B,T) estimated from the complete sets of samples on L, B, T does
not correspond to the true regression of L on (B,T) in the original population
of femur bones. But others like

\[
  p^s(b,t \mid \ell), \quad p^d(b,t \mid \ell), \quad
  p^{s \text{ or } d}(b), \quad p^{s \text{ or } d}(t) \tag{10.4}
\]

are independent of $s(\ell)$, and the properties of these distributions could be
used to estimate all the unknown parameters when $s(\ell)$ is unknown.
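
The contrast between the two kinds of regression can be illustrated by simulation. The sketch below uses a hypothetical model with a single breadth B: standardised length L is normal, B = 0.8 L plus independent noise, and a bone survives with chance 0.9 when L is below its mean and 0.1 otherwise. In the full population both regression slopes equal 0.8 under this model. None of these numerical choices come from the report.

```python
import random


def slope(xs, ys):
    """Least-squares slope of the regression of y on x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate_bones(n, seed=0):
    """Return (L, B) pairs for the bones that survive intact.

    Hypothetical model: L ~ N(0, 1), B = 0.8 L + N(0, 0.6^2) noise, and
    survival chance s(l) = 0.9 for l < 0 and 0.1 otherwise, so that
    longer bones break more often.
    """
    rng = random.Random(seed)
    survivors = []
    for _ in range(n):
        length = rng.gauss(0.0, 1.0)
        breadth = 0.8 * length + rng.gauss(0.0, 0.6)
        if rng.random() < (0.9 if length < 0 else 0.1):
            survivors.append((length, breadth))
    return survivors


pairs = simulate_bones(200_000)
lengths = [l for l, _ in pairs]
breadths = [b for _, b in pairs]
print(round(slope(lengths, breadths), 3))   # B on L: close to the population 0.8
print(round(slope(breadths, lengths), 3))   # L on B: below the population 0.8
```

The regression of B on L survives the selection because the conditional distribution of B given L is unweighted, whereas the regression of L on B is distorted.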

For instance using all the available measurements on B and T (both on
damaged and well preserved bones), the mean values $\mu_B$ and $\mu_T$ of B and T in
the original population could be estimated by the usual averages. From the
observations on the complete set of L, B and T we can estimate the regressions
of B on L and T on L in the usual way. Then the missing values of L can be
estimated in each case, i.e., where B alone or T alone is available, by
inverse regression using the regression equation of B on L or T on L. Now
the average of the observed values of L and the estimated values of L in
missing cases is computed as an estimate of $\mu_L$, the mean value of L in the
original population. In a similar manner the second order moments can be
estimated using the relationships between the parameters of the original
distribution of L, B and T and of the conditional distributions (10.4).


11. CALCUTTA BLACKOUT DISTRIBUTION

Suppose that we are conducting an experiment to measure the time taken
for a certain event to happen, and for running the experiment a continuous
supply of electric power is needed. If the power supply is cut off before
the event happens, then the experiment has to be abandoned and no observation
gets recorded. What distribution do the recorded observations resulting
only from the successful experiments (when the power supply is on until
the event occurred) obey?

Let f(x) be the p.d.f. of X, the time taken for an event to happen, and
g(t) be the p.d.f. of T, the time at which the electric supply may fail (in
Calcutta this is a random phenomenon producing a blackout). An observation
on X gets recorded only when a pair (x,t) occurs such that x ≤ t. The p.d.f.
of a pair (X,T) such that X ≤ T is

\[
  \frac{f(x)\,g(t)}{P(X \le T)}, \tag{11.1}
\]

so that the p.d.f. of the recorded variable $X^{(r)}$ is

\[
  f^{(r)}(x) = \int_x^{\infty} \frac{f(x)\,g(t)}{P(X \le T)}\,dt
             = \frac{f(x)\,(1 - G(x))}{P(X \le T)}, \tag{11.2}
\]


where  G(t)  is  the  distribution  function  of  T.  The  distribution  (11.2)  is  a 
weighted  version  of  the  distribution  of  X,  which  I  termed  as  the  Calcutta 
Blackout  Distribution  (CBD). 
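
A small simulation shows the effect of the weighting. The sketch assumes, purely for illustration, that X and T are independent exponential variables with rates 1.0 and 0.5; in that case (11.2) is again exponential, with rate equal to the sum of the two rates, so the recorded times have a smaller mean than X itself. The function name is not from the report.

```python
import random


def recorded_times(rate_x, rate_t, n, seed=0):
    """Simulate n experiments and keep X only when X <= T (no blackout).

    Hypothetical choice: X and T are independent exponentials with the
    given rates.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        x = rng.expovariate(rate_x)
        t = rng.expovariate(rate_t)
        if x <= t:
            kept.append(x)
    return kept


kept = recorded_times(rate_x=1.0, rate_t=0.5, n=200_000)
print(len(kept) / 200_000)        # near P(X <= T) = 1.0 / 1.5
print(sum(kept) / len(kept))      # near 1 / (1.0 + 0.5), not the true mean 1.0
```

Averaging only the successful experiments would therefore underestimate the mean time to the event.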

If we have observations from successful experiments alone, then the
relevant distribution is (11.2). However, in such a situation other observations
could be made. The appropriate distributions when additional information
is available are discussed below.

If we define a random variable Z = min(X,T), then it is observable in
each experiment. The p.d.f. of Z is

\[
  h(z) = -\frac{d}{dz}\int_z^{\infty}\!\!\int_z^{\infty} f(x)\,g(t)\,dx\,dt
       = [1 - F(z)]\,g(z) + [1 - G(z)]\,f(z), \tag{11.3}
\]

which is a mixture of weighted distributions.

In the experiment described above, there is also the possibility of
recording Z* = min(X,T) with the identifying symbol whether the true
observation is on X or T. In such a case the p.d.f. of Z* is

\[
  h^*(z) =
  \begin{cases}
    f(z)\,(1 - G(z)) & \text{if } z \text{ is an observation on } X, \\
    g(z)\,(1 - F(z)) & \text{if } z \text{ is an observation on } T.
  \end{cases} \tag{11.4}
\]

12.  CLOUDED  DISTRIBUTIONS 

When the sea-surface temperature is measured by a satellite, there is
a possibility that the reading is affected by a cloud cover resulting in
reduced values of temperatures. The amount by which a measurement is scaled
down depends on the thickness of the cloud. But when a large number of
measurements are taken in a given area, there will be a proportion of data
which is free from cloud contamination while the rest are affected by clouds
of different thickness. If p(t) is the true distribution of the sea-surface
temperatures, whose average we are seeking, q(c), 0 ≤ c ≤ 1, is the p.d.f.
of cloudiness in the area under cloud cover and $\lambda$ is the proportion of the
area without cloud cover, then the p.d.f. relevant to the observed temperatures
is

\[
  \lambda\,p(t) + (1-\lambda)\int_0^1 \frac{1}{c}\,p\!\left(\frac{t}{c}\right)q(c)\,dc. \tag{12.1}
\]

The proportion $\lambda$ and the p.d.f. q(c) are generally unknown in any given
situation, and modelling the entire data for the unknown elements is extremely
difficult. However, when $\lambda$ is large, the distribution (12.1) is dominated
by p(t) in the right tail, and this can be judged by the smoothness of the
histogram of the observed data relating to large values of the temperature.
When this happens, we can consider the data in the tails of the histogram
as uncontaminated observations and use them only in estimating the mean
sea-surface temperature. Such a technique was used by Smith, Rao, Koffler
and Curtis (1970). They assumed that the temperature distribution is normal
(with mean $\mu$ and variance $\sigma^2$) and an estimate of $\sigma$ is available from a
different source, and equated the observed (estimated) point of inflexion in
the right tail of the smoothed histogram to $\mu + \sigma$, which provided an estimate
of $\mu$. An alternative method is to consider a truncation point $\tau$ and estimate
the mean using only the observations which are equal to or exceeding $\tau$. The
estimate of $\mu$ in such a case satisfies the equation

\[
  \bar{t}_\tau = \mu + \frac{\sigma\,Z\!\left(\frac{\tau-\mu}{\sigma}\right)}
                            {1 - \Phi\!\left(\frac{\tau-\mu}{\sigma}\right)}, \tag{12.2}
\]

where $\bar{t}_\tau$ is the average of observations greater than or equal to $\tau$, and
Z and $\Phi$ denote the standard normal density and distribution function. We
denote the solution of (12.2) by $\mu_\tau$. Then we draw the graph of $\mu_\tau$ against
$\tau$ and choose that value of $\tau$, say $\tau_0$, from where the graph shows a tendency
to be parallel to the $\tau$ axis. The estimate of $\mu$ is taken as $\mu_{\tau_0}$.
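
Equation (12.2) has no closed-form solution for the mean, but it is easily solved numerically. The sketch below uses bisection, relying on the fact that the mean of a normal distribution truncated below at a fixed point increases with the untruncated mean. The function names, the search bracket and the numerical values are assumptions of this sketch, not part of the report.

```python
from math import erfc, exp, pi, sqrt


def truncated_normal_mean(mu, sigma, tau):
    """E[X | X >= tau] for X ~ N(mu, sigma^2): right-hand side of (12.2)."""
    a = (tau - mu) / sigma
    density = exp(-0.5 * a * a) / sqrt(2.0 * pi)
    upper_tail = 0.5 * erfc(a / sqrt(2.0))       # 1 - Phi(a)
    return mu + sigma * density / upper_tail


def solve_mu(t_bar, sigma, tau, iterations=200):
    """Solve (12.2) for mu by bisection; requires t_bar > tau."""
    lo, hi = t_bar - 20.0 * sigma, t_bar
    if truncated_normal_mean(lo, sigma, tau) > t_bar:
        raise ValueError("no root in the search bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_normal_mean(mid, sigma, tau) > t_bar:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical values: true mean 20, sigma 2, truncation point 21.
t_bar = truncated_normal_mean(20.0, 2.0, 21.0)
print(solve_mu(t_bar, sigma=2.0, tau=21.0))
```

Repeating the calculation over a grid of truncation points, with the tail average recomputed from the data each time, gives the curve whose flat portion is used to choose the final estimate.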